SUMMARY


The report is an introduction to an important but relatively neglected
aspect of regression theory which deals with near linear dependency in
measured data, called collinearity. Several ways of detecting and assessing
collinearity are discussed. Because data collinearity usually results in poor
least squares estimates, two estimation techniques which can limit the
damaging effect of collinearity are presented. These two techniques,
principal components regression and mixed estimation, belong to a class of
biased estimation techniques.

Data collinearity detection and assessment, and the two biased estimation
techniques, are demonstrated in two examples using flight test data from
longitudinal maneuvers of an experimental aircraft. The eigensystem analysis
and parameter variance decomposition appeared to be a promising tool for
collinearity evaluation. The biased estimators had far better accuracy than
the results from the ordinary least squares technique.
r             rank of X matrix
s             standard error
$\kappa$      condition number
$\Lambda$     matrix of eigenvalues
$\lambda$     eigenvalue of $X^T X$
$\mu$         singular value of X
$\xi$         sensitivity
$\pi_{kj}$    variance proportion of jth regression coefficient
              associated with kth component of its decomposition
$\rho$        air density, kg/m$^3$
$\sigma^2$    variance
^             least squares estimate
~             biased estimate
*             standardized regressors
              scaled regressors
.             derivative with respect to time


Matrix Exponents:

T             transpose matrix
-1            inverse matrix


Abbreviations:

LS            least squares
ME            mixed estimation
PC            principal components
SVD           singular value decomposition
INTRODUCTION


Recently, the introduction of highly maneuverable and often inherently
unstable aircraft has been presenting new challenges to aircraft
identification and parameter estimation. These new aircraft may have more
control surfaces than conventional aircraft, and these surfaces are moved
through a flight control system. Such a system can introduce a close
relationship between the deflections of various surfaces and at the same time
can preclude maneuvers suitable for system identification. These
characteristics can be reflected in an inability to estimate the
effectiveness of individual control surfaces and to obtain accurate estimates
of the remaining parameters. One of the reasons for these problems is the
near linear relationship among several variables entering the model used by
various parameter estimation techniques.

Near linear dependency among variables in linear regression, often called
collinearity, has been studied by many statisticians. An introduction to the
problem of collinearity is presented in ref. 1. The purpose of this report is
to discuss briefly collinearity in a general model for linear regression, the
detection of collinearity and its remedy. Two methods of dealing with
collinear data, principal components regression and mixed estimation, are
presented. They are based on an extension of the ordinary least squares
technique. The report concludes with two examples using real flight data. In
these examples the detection of collinearity and the application of the
estimation techniques described are demonstrated.
COLLINEARITY


The linear regression model can be formulated as

$$ y = \theta_0 + \theta_1 x_1 + \cdots + \theta_n x_n \tag{1} $$

where $x_j$, j = 1, 2, ..., n, are the regressors, y is a dependent variable
and $\theta_0, \theta_1, \ldots, \theta_n$ are the unknown parameters. After
substituting measured values into (1) the regression equation has the form

$$ Y = X\theta + \varepsilon \tag{2} $$

where Y is an $(N \times 1)$ vector, $\theta$ is an $((n+1) \times 1)$
vector, $\varepsilon$ is an $(N \times 1)$ vector of measurement noise and X
is the $(N \times (n+1))$ matrix of regressors and ones

$$ X = \begin{bmatrix}
1 & x_{11} & x_{21} & \cdots & x_{n1} \\
1 & x_{12} & x_{22} & \cdots & x_{n2} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{1N} & x_{2N} & \cdots & x_{nN}
\end{bmatrix} $$

with N indicating the number of data points. The least squares estimates of
the unknown parameters are obtained from

$$ \hat{\theta}_{LS} = (X^T X)^{-1} X^T Y \tag{3} $$

For further discussion and analysis it will be more convenient to deal
with regressor variables which have been standardized (centered and scaled to
unit length), see Appendix A. There, the matrix $X^{*T} X^*$ is the
$(n \times n)$ matrix of correlations, so called because the off-diagonal
elements of this matrix are quite often referred to as correlation
coefficients, although the regressors are not necessarily random variables.
Denoting $X^*_j$, j = 1, 2, ..., n, as the columns of the $X^*$ matrix with
centered and scaled regressors, the matrix $X^*$ can be expressed as

$$ X^* = [X^*_1, X^*_2, \ldots, X^*_n] \tag{4} $$
If $X_j^{*T} X_k^* = 0$, $j \ne k$, the regressors are orthogonal and the
$X^{*T} X^*$ matrix is a diagonal matrix. The vectors
$X^*_1, X^*_2, \ldots, X^*_n$ are called linearly dependent if there is a set
of constants, $k_j$, not all zero, such that

$$ \sum_{j=1}^{n} k_j X^*_j = 0 \tag{5} $$

Then the rank of $X^{*T} X^*$ is less than n and $(X^{*T} X^*)^{-1}$ does not
exist.


In many practical applications of linear regression eq. (5) is only
approximately true. This indicates near linear dependency in $X^*$, and the
problem of collinearity exists. In such a case $X^{*T} X^*$ is called ill
conditioned, and collinearity can cause computational problems and reduce the
accuracy of estimates. Thus in the context of linear regression, collinearity
is a data problem, not a statistical phenomenon.
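
The standardization and the resulting ill conditioning can be illustrated
numerically. The following is a minimal sketch, not taken from the report: it
assumes NumPy, uses synthetic regressors in which the third column is almost
the sum of the first two, and the helper name `standardize` is hypothetical.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit length."""
    Xc = X - X.mean(axis=0)
    return Xc / np.linalg.norm(Xc, axis=0)

# Synthetic regressors (illustrative only): x3 is almost x1 + x2,
# so eq. (5) holds approximately with constants (1, 1, -1).
rng = np.random.default_rng(0)
N = 200
x1 = rng.normal(size=N)
x2 = rng.normal(size=N)
x3 = x1 + x2 + 1e-3 * rng.normal(size=N)
X = np.column_stack([x1, x2, x3])

Xs = standardize(X)
R = Xs.T @ Xs                 # matrix of correlations, X*^T X*
cond = np.linalg.cond(R)      # a large value signals ill conditioning

print(np.round(R, 3))
print(f"condition number of X*^T X*: {cond:.3g}")
```

The diagonal of `R` is one by construction, and its off-diagonal elements
agree with the usual sample correlation coefficients.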


There are at least three different sources of collinearity, namely, 

a) design of an experiment, 

b) constraints in the data, 

c) model specification. 


If the experiment is designed in such a way that the resulting data lie
mostly on a subspace of the region defined approximately by (5), then
collinearity might occur. This type of problem can arise during the test of a
dynamical system where one or more variables representing regressors in (1)
were not sufficiently excited. The constraints in the data could be caused by
an inherent property of the system under test. For example, an aircraft
stability augmentation system can deflect various control surfaces in concert,
thus causing near linear dependence among their deflections. Finally, to
avoid collinearity, a specified model should not be overparameterized. For
example, it should not include nonlinear terms, such as $x_1^2$ or $x_1 x_2$,
if $x_1$ is small.


The presence of collinearity usually results in various unwanted proper- 
ties of the least squares estimates of unknown parameters. Two of them, 
illustrated in ref. 1, include too large absolute values for parameter esti- 
mates and their large variances and covariances. 
DETECTION AND ASSESSMENT OF COLLINEARITY


Many procedures have been employed to detect collinearity. They are
discussed in ref. 1 and ref. 2. In this report only three of them will be
considered and their use later demonstrated in examples. These procedures
are:

1. Examination of the correlation matrix and its inverse.


This is the simplest and most straightforward procedure. A high correlation
coefficient between two regressors can point to a possible collinearity
problem. The absence of high correlation, however, cannot be viewed as
evidence of no problem. The correlation matrix is unable to reveal the
presence of several coexisting near dependencies among the regressors, as
demonstrated in ref. 1. Because of this shortcoming of $X^{*T} X^*$ as a
diagnostic measure of collinearity, the usefulness of its inverse is also
limited. The diagonal elements of $(X^{*T} X^*)^{-1}$ are often called the
variance inflation factors, $\mathrm{VIF}_j$, and they can be expressed as

$$ \mathrm{VIF}_j = \frac{1}{1 - R_j^2} \tag{6} $$

where $R_j^2$ is the squared multiple correlation coefficient of $x_j$
regressed on the remaining regressors (see ref. 1 and Appendix B for the
development of $R_j^2$). The term "variance inflation factor" reflects its
relationship with the jth parameter variance $\sigma^2(\hat{\theta}_j)$. As
shown in ref. 3

$$ \sigma^2(\hat{\theta}_j) = \frac{\sigma^2}{X_j^{*T} X_j^*}\,\mathrm{VIF}_j \tag{7} $$

The diagnostic value of VIF follows from expression (6). A large value of VIF
indicates an $R_j^2$ near unity and hence points to collinearity. The
weakness of this diagnostic measure is its inability to distinguish among
several coexisting near dependencies and its lack of meaningful boundaries
for values of VIF.
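
The two descriptions of the variance inflation factor, as a diagonal element
of the inverse correlation matrix and through eq. (6), can be checked against
each other. The sketch below is illustrative only: it assumes NumPy and
synthetic data, and the function names `vif_from_inverse` and `vif_from_r2`
are hypothetical.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit length."""
    Xc = X - X.mean(axis=0)
    return Xc / np.linalg.norm(Xc, axis=0)

def vif_from_inverse(X):
    """Diagonal of (X*^T X*)^-1 for the standardized regressors X*."""
    Xs = standardize(X)
    return np.diag(np.linalg.inv(Xs.T @ Xs))

def vif_from_r2(X):
    """VIF_j = 1 / (1 - R_j^2), eq. (6), regressing x_j on the other columns."""
    N, n = X.shape
    vif = np.empty(n)
    for j in range(n):
        xj = X[:, j]
        others = np.column_stack([np.ones(N), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, xj, rcond=None)
        resid = xj - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((xj - xj.mean()) ** 2)
        vif[j] = 1.0 / (1.0 - r2)
    return vif

# Synthetic data: x3 is close to x1 + x2, x4 is unrelated to the others.
rng = np.random.default_rng(0)
N = 200
x1, x2, x4 = rng.normal(size=(3, N))
x3 = x1 + x2 + 0.1 * rng.normal(size=N)
X = np.column_stack([x1, x2, x3, x4])

vif_a = vif_from_inverse(X)
vif_b = vif_from_r2(X)
print(np.round(vif_a, 2))
print(np.round(vif_b, 2))
```

In this construction the first three factors are large and the fourth stays
close to one, but the factors alone do not show which regressors take part in
the same near dependency.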


2. Eigensystem Analysis and Singular Value Decomposition

The matrix $X^T X$ can be decomposed as

$$ X^T X = T \Lambda T^T \tag{8} $$

where $\Lambda$ is an $(n \times n)$ diagonal matrix whose diagonal elements
are the eigenvalues $\lambda_j$, j = 1, 2, ..., n, of $X^T X$, and T is an
$(n \times n)$ orthogonal matrix whose columns are the eigenvectors of
$X^T X$. Eigenvalues close to zero indicate near linear dependency in the
data. The elements of the corresponding eigenvectors could reveal the nature
of this dependency. Collinearity is, therefore, indicated by the presence of
a "small" eigenvalue. Unfortunately there is no specification of what "small"
is. In order to avoid this problem some authors use the condition number of
$X^T X$ defined as

$$ \kappa_j = \frac{\lambda_{\max}}{\lambda_j}, \quad j = 1, 2, \ldots, n \tag{9} $$

They then consider a condition number exceeding 1000 as an indication of
severe collinearity (see ref. 1).


In ref. 2 an approach using singular-value decomposition for diagnosing
collinearity is recommended. It is based on the decomposition of matrix X as

$$ X = U D T^T \tag{10} $$

where U is an $(N \times n)$ matrix and $U^T U = T^T T = I$. The matrix D is
an $(n \times n)$ diagonal matrix with nonnegative diagonal elements $\mu_j$,
j = 1, 2, ..., n, which are called the singular values of X. The
singular-value decomposition (SVD) is closely related to the concept of
eigenvalues and eigenvectors, since from (8) and (10)

$$ X^T X = T D^2 T^T = T \Lambda T^T \tag{11} $$

The diagonal elements of $D^2$ are therefore the eigenvalues of $X^T X$, and
the columns of U are the eigenvectors of $X X^T$ associated with its n
nonzero eigenvalues. The degree of ill conditioning depends on how small the
singular value is relative to the maximum singular value. In this connection
a condition index of the matrix X is proposed as

$$ \text{condition index}_j = \frac{\mu_{\max}}{\mu_j}, \quad j = 1, 2, \ldots, n \tag{12} $$

It is further suggested to consider condition indices from 30 to 100 as
evidence of moderately to strongly collinear data.


The SVD of the matrix X provides similar information to that given by the
eigensystem of $X^T X$. The use of SVD is, however, preferred by many
authors, mainly because its computation is numerically more stable than that
of the eigensystem of $X^T X$. This may be especially true when $X^T X$ is
ill conditioned.
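
The relation between the two diagnostics can be sketched numerically. The
example below is an illustration under stated assumptions, not the report's
computation: it assumes NumPy, synthetic regressors scaled to unit length,
and hypothetical function names.

```python
import numpy as np

def scale_columns(X):
    """Scale each column to unit length (no centering)."""
    return X / np.linalg.norm(X, axis=0)

def eigen_condition_numbers(X):
    """Eigenvalues of X^T X (descending) and kappa_j = lambda_max / lambda_j, eq. (9)."""
    lam = np.linalg.eigvalsh(X.T @ X)[::-1]
    return lam, lam[0] / lam

def svd_condition_indices(X):
    """Singular values of X (descending) and mu_max / mu_j, eq. (12)."""
    mu = np.linalg.svd(X, compute_uv=False)
    return mu, mu[0] / mu

# Synthetic data: a bias column plus three regressors with x3 close to x1 + x2.
rng = np.random.default_rng(1)
N = 200
x1 = rng.normal(size=N)
x2 = rng.normal(size=N)
x3 = x1 + x2 + 0.01 * rng.normal(size=N)
X = scale_columns(np.column_stack([np.ones(N), x1, x2, x3]))

lam, kappa = eigen_condition_numbers(X)
mu, index = svd_condition_indices(X)
print("kappa_j      :", np.round(kappa, 1))
print("mu_max / mu_j:", np.round(index, 1))
```

Because $\lambda_j = \mu_j^2$ by (11), each condition number in (9) is the
square of the corresponding condition index in (12), which is consistent with
the two thresholds quoted above (1000 is close to the square of 30).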
3. Parameter Variance Decomposition

The parameter variance decomposition approach for detecting collinearity
was proposed in ref. 2. It follows from the covariance matrix of parameter
estimates $\hat{\theta}$, which is obtained as

$$ \mathrm{Cov}(\hat{\theta}) = \sigma^2 (X^T X)^{-1} = \sigma^2 T \Lambda^{-1} T^T \tag{13} $$

The variance of each parameter is equal to

$$ \sigma^2(\hat{\theta}_j) = \sigma^2 \sum_{k=1}^{n} \frac{t_{jk}^2}{\mu_k^2} \tag{14} $$

where $t_{jk}$ is the jth element of the eigenvector $t_k$ associated with
$\mu_k$ (recall from (11) that $\lambda_k = \mu_k^2$). Eq. (14) decomposes
the variance of each parameter into a sum of components, each corresponding
to one and only one of the n singular values $\mu_k$. In (14) the singular
values appear in the denominator, so one or more small singular values can
substantially increase the variance of $\hat{\theta}_j$. This means that an
unusually high proportion of the variance of two or more coefficients for the
same small singular value can provide evidence that the corresponding near
dependency is causing problems. Introducing

$$ \phi_{jk} = \frac{t_{jk}^2}{\mu_k^2} \quad \text{and} \quad \phi_j = \sum_{k=1}^{n} \phi_{jk} $$

the j, k variance-decomposition proportion, defined as the proportion of the
variance of the jth regression coefficient associated with the kth component
of its decomposition in (14), is given as

$$ \pi_{kj} = \frac{\phi_{jk}}{\phi_j}, \quad j, k = 1, 2, \ldots, n \tag{15} $$

Since two or more regressors are required to create a near dependency, two
or more variances will be adversely affected by high variance-decomposition
proportions associated with a single singular value. Variance-decomposition
proportions greater than 0.5 are recommended in ref. 2 as guidance for
possible collinearity problems. It is also suggested that the columns of X
should be scaled to unit length but not centered. Thus the role of the bias
term in near-linear dependencies can be diagnosed.
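
Equations (14) and (15) translate directly into a few lines of linear
algebra. The following sketch assumes NumPy and synthetic data; the function
name `variance_proportions` and the layout of the returned table (rows for
singular values, columns for regressors) are choices made for this
illustration only.

```python
import numpy as np

def variance_proportions(X):
    """Singular values mu_k (descending) and proportions pi[k, j], eq. (15).

    Columns of X are scaled to unit length but not centered, as suggested
    in the text, so a bias column can take part in the diagnosis.
    """
    Xs = X / np.linalg.norm(X, axis=0)
    _, mu, Tt = np.linalg.svd(Xs, full_matrices=False)
    T = Tt.T                          # T[j, k]: jth element of kth eigenvector
    phi = T ** 2 / mu ** 2            # phi[j, k] = t_jk^2 / mu_k^2
    pi = (phi / phi.sum(axis=1, keepdims=True)).T
    return mu, pi

# Synthetic data: bias, x1, x2, x3 (close to x1 + x2) and an unrelated x4.
rng = np.random.default_rng(2)
N = 200
x1, x2, x4 = rng.normal(size=(3, N))
x3 = x1 + x2 + 0.01 * rng.normal(size=N)
X = np.column_stack([np.ones(N), x1, x2, x3, x4])

mu, pi = variance_proportions(X)
print("condition indices:", np.round(mu[0] / mu, 1))
print("proportions for the smallest singular value:", np.round(pi[-1], 3))
```

In this construction the row for the smallest singular value carries
proportions above 0.5 only for x1, x2 and x3, the regressors that form the
near dependency.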
SENSITIVITY ANALYSIS


As was mentioned earlier, the design of an experiment and constraints in
the data can contribute to data collinearity. Both of these phenomena may
also influence parameter identifiability, resulting in limited accuracy of
the parameter estimates. One possible way to assess parameter
identifiability is based on sensitivity analysis. For the regression model

$$ Y = X\theta + \varepsilon \tag{2} $$

the sensitivity of the dependent variable to changes in the parameter
$\theta_j$, keeping the remaining parameters fixed, is given as

$$ \frac{\partial Y}{\partial \theta_j} = X \frac{\partial \theta}{\partial \theta_j} \tag{16} $$

Then the measure of sensitivity $\xi_j$ for the parameter $\theta_j$ can be
defined as

$$ \xi_j = \theta_j^2 \sum_{i=1}^{N} \left( \frac{\partial Y_i}{\partial \theta_j} \right)^2
 = \theta_j^2 \left( \frac{\partial \theta}{\partial \theta_j} \right)^T X^T X \left( \frac{\partial \theta}{\partial \theta_j} \right)
 = \theta_j^2 X_j^T X_j \tag{17} $$

where $X_j$ is the column of X associated with $\theta_j$.

For practical computing of the sensitivities the values of parameters in
the regression model must be known. Because the parameter values are the
subject of estimation, the question can arise of what values for $\theta_j$
should be used in computing $\xi_j$. The least squares estimates using data
with strong collinearity and/or low parameter sensitivity could be highly
unstable, thus causing distortions in the computed values of $\xi_j$. In
these cases more stable estimates or a priori values for parameters should
be used.
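
As a small worked illustration of the sensitivity measure, the sketch below
evaluates the closed form $\theta_j^2 X_j^T X_j$ for a toy regressor matrix.
It assumes NumPy and takes the last form of eq. (17) as the definition; the
function name `sensitivities` and the numbers are hypothetical.

```python
import numpy as np

def sensitivities(X, theta):
    """xi_j = theta_j^2 * X_j^T X_j for every column X_j of X."""
    return theta ** 2 * np.sum(X ** 2, axis=0)

# Toy data: a bias column and one regressor, with assumed parameter values.
X = np.array([[1.0, 2.0],
              [1.0, 0.0],
              [1.0, 1.0]])
theta = np.array([2.0, -1.0])

xi = sensitivities(X, theta)
print(xi)   # xi = [12, 5]: 2^2 * 3 for the bias, (-1)^2 * 5 for the regressor
```

The same numbers follow from summing the squared contributions
$\theta_j X_j$ of each term to the model output, which is the first form in
(17).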


BIASED ESTIMATION TECHNIQUES 

There are several ways to deal with the problem of collinearity. They
include the collection of additional data, redesign of the experiment, model
respecification, and the use of estimation techniques other than the
ordinary least squares procedure. This report will address only the last
possibility mentioned.

As discussed previously, the application of the ordinary least squares
technique to a set of data with collinearity problems can result in large
estimated values for parameters and large values for their covariances. The
least squares technique provides an unbiased linear estimator which,
according to the Gauss-Markoff theorem (see for example ref. 4), has minimum
variance in the class of unbiased linear estimators. There is no guarantee,
however, that this variance will be small. Figure 1 illustrates a situation
of two distributions of a parameter estimate. One estimate, $\hat{\theta}$,
is unbiased (a possible result of the least squares technique); the other,
$\tilde{\theta}$, is biased (obtained by a biased estimation technique). In
the first case the variance of $\hat{\theta}$ is large, indicating a large
confidence interval on $\theta$ and an unstable point estimate
$\hat{\theta}$. In the second case the estimate $\tilde{\theta}$ is subject
to a bias error, $E(\tilde{\theta}) - \theta$, but has much smaller variance.
The resulting mean square error of the estimator $\tilde{\theta}$ is

$$ \mathrm{MSE}(\tilde{\theta}) = E(\tilde{\theta} - \theta)^2 = \sigma^2(\tilde{\theta}) + [E(\tilde{\theta}) - \theta]^2 \tag{18} $$

It is possible that for small bias error the $\mathrm{MSE}(\tilde{\theta})$
could be smaller than the variance of the least squares estimator
$\sigma^2(\hat{\theta})$. This possibility has inspired the development of
various biased estimation techniques. Two of them, principal components
regression and mixed estimation, will be described and applied to
experimental data.
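
The decomposition in (18) follows from adding and subtracting the expected
value of the estimator. The short derivation below is a sketch in the
notation of (18); the numbers in the closing illustration are hypothetical
and are not taken from the flight data.

```latex
\begin{aligned}
E(\tilde{\theta} - \theta)^2
  &= E\big[(\tilde{\theta} - E(\tilde{\theta})) + (E(\tilde{\theta}) - \theta)\big]^2 \\
  &= E\big[\tilde{\theta} - E(\tilde{\theta})\big]^2
     + 2\,\big[E(\tilde{\theta}) - \theta\big]\,E\big[\tilde{\theta} - E(\tilde{\theta})\big]
     + \big[E(\tilde{\theta}) - \theta\big]^2 \\
  &= \sigma^2(\tilde{\theta}) + \big[E(\tilde{\theta}) - \theta\big]^2 ,
  \qquad \text{since } E\big[\tilde{\theta} - E(\tilde{\theta})\big] = 0 .
\end{aligned}
% Hypothetical illustration: an unbiased estimator with variance 4 has MSE 4,
% while a biased estimator with variance 0.5 and bias 1 has MSE 0.5 + 1^2 = 1.5.
```

In the hypothetical illustration the biased estimator has the smaller mean
square error even though it is not centered on the true value.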


PRINCIPAL COMPONENTS REGRESSION 

The development of the principal components estimator starts by transforming
the original regressors $x_j$, j = 1, 2, ..., n, to the space of orthogonal
regressors $z_j$. This transformation is accomplished by introducing

$$ Z = XT \tag{19} $$

and

$$ \theta = T\gamma \tag{20} $$

where

$$ T^T X^T X T = \Lambda \quad \text{and} \quad T^T T = T T^T = I $$

Using (19) and (20) the regression model (2) becomes

$$ Y = Z\gamma + \varepsilon \tag{21} $$

with the LS estimator of $\gamma$ as

$$ \hat{\gamma} = \Lambda^{-1} Z^T Y \tag{22} $$

The columns of Z, which define a new set of orthogonal regressors, are
referred to as principal components.

To obtain the principal components estimator the regressors in (21) are
arranged in order of decreasing eigenvalues

$$ \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n $$

Then the principal components estimator is given by a vector in which the
first r components agree with $\hat{\gamma}$ and the remaining s = n - r
components are zero

$$ \tilde{\gamma}_{PC} = [\hat{\gamma}_1, \hat{\gamma}_2, \ldots, \hat{\gamma}_r, 0, \ldots, 0]^T \tag{23} $$

The LS estimates in (22) can also be obtained as

$$ \hat{\gamma}_j = \lambda_j^{-1} t_j^T X^T Y, \quad j = 1, 2, \ldots, n \tag{24} $$

where $t_j$ is the jth column of the eigenvector matrix T. By comparing (23)
and (24) it follows that the estimates $\tilde{\gamma}_{PC}$ are obtained by
setting the s = n - r smallest eigenvalues to zero, which is equivalent to
assuming that the matrix X has rank r < n.


The principal components estimates of parameters associated with the
original regressors $x_j$ are obtained from (20) and (23) as

$$ \tilde{\theta}_{PC} = T \tilde{\gamma}_{PC} \tag{25} $$
In order to find the expressions for the bias in the principal components
estimates and for the variance of these estimates, the eigenvalue and
eigenvector matrices are partitioned as

$$ \Lambda = \begin{bmatrix} \Lambda_r & 0 \\ 0 & \Lambda_s \end{bmatrix}, \qquad T = [T_r \;\; T_s] $$

where $\Lambda_r$ and $\Lambda_s$ are diagonal matrices containing the
eigenvalues associated with the retained and eliminated principal components
respectively. The eigenvector matrices $T_r$ and $T_s$ are similarly
partitioned. The LS estimator of the parameters $\gamma$ that are retained is

$$ \hat{\gamma}_r = \Lambda_r^{-1} T_r^T X^T Y \tag{26} $$

From (24), (25) and (26) it follows that

$$ \tilde{\theta}_{PC} = T_r \hat{\gamma}_r = \sum_{j=1}^{r} \lambda_j^{-1} t_j^T X^T Y \, t_j \tag{27} $$

The expected value of the PC estimator is

$$ E(\tilde{\theta}_{PC}) = T_r \gamma_r = T_r T_r^T \theta $$

Since

$$ T T^T = I = T_r T_r^T + T_s T_s^T $$

it follows that

$$ E(\tilde{\theta}_{PC}) = [I - T_s T_s^T]\,\theta = \theta - T_s \gamma_s $$

Thus the PC estimates of the n parameters $\theta$ are biased by the
quantity $-T_s \gamma_s$.

Assuming that $\varepsilon$ has zero mean and variance $\sigma^2 I$, the
covariance matrix of the LS estimates of $\theta$ is given as

$$ \mathrm{Cov}(\hat{\theta}) = \sigma^2 (X^T X)^{-1} = \sigma^2 T \Lambda^{-1} T^T
 = \sigma^2 \left( T_r \Lambda_r^{-1} T_r^T + T_s \Lambda_s^{-1} T_s^T \right) \tag{28} $$
The form of the covariance matrix for the PC estimates of $\theta$ follows
from (13), (14) and (27) as

$$ \mathrm{Cov}(\tilde{\theta}_{PC}) = \sigma^2 T_r \Lambda_r^{-1} T_r^T
 = \sigma^2 \sum_{j=1}^{r} \lambda_j^{-1} t_j t_j^T \tag{29} $$

The comparison of (28) and (29) reveals that the elimination of principal
components will result in a decrease in the variance of the parameters
$\tilde{\theta}_{PC}$. The diagonal elements of the matrix
$T_s \Lambda_s^{-1} T_s^T$ are weighted sums of the inverses of the
eigenvalues associated with the eliminated principal components. If these
eigenvalues are small, a substantial reduction in the variance of the PC
estimates can be expected.
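
The steps in (23) to (25) and the covariance in (29) can be sketched as
follows. This is a minimal illustration assuming NumPy and synthetic data
with no bias term; the function name `pc_regression` is hypothetical, and the
example makes no claim about which estimate is closer to the true
parameters, since that depends on the bias term $T_s \gamma_s$.

```python
import numpy as np

def pc_regression(X, Y, r):
    """PC estimate keeping the r largest eigenvalues of X^T X.

    Returns the estimate of theta and its covariance divided by sigma^2.
    """
    lam, T = np.linalg.eigh(X.T @ X)
    order = np.argsort(lam)[::-1]            # decreasing eigenvalues
    lam, T = lam[order], T[:, order]
    gamma = (T.T @ X.T @ Y) / lam            # eq. (24)
    gamma_pc = gamma.copy()
    gamma_pc[r:] = 0.0                       # eq. (23)
    theta_pc = T @ gamma_pc                  # eq. (25)
    cov_unit = (T[:, :r] / lam[:r]) @ T[:, :r].T   # eq. (29) with sigma^2 = 1
    return theta_pc, cov_unit

# Synthetic data: x3 is close to x1 + x2; parameter values are assumed.
rng = np.random.default_rng(3)
N = 100
x1 = rng.normal(size=N)
x2 = rng.normal(size=N)
x3 = x1 + x2 + 0.05 * rng.normal(size=N)
X = np.column_stack([x1, x2, x3])
Y = X @ np.array([1.0, 2.0, 0.5]) + 0.1 * rng.normal(size=N)

theta_full, cov_full = pc_regression(X, Y, r=3)   # r = n reproduces LS
theta_pc, cov_pc = pc_regression(X, Y, r=2)       # smallest eigenvalue dropped
theta_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)

print("LS estimate      :", np.round(theta_ls, 3))
print("PC estimate (r=2):", np.round(theta_pc, 3))
print("variance / sigma^2, LS:", np.round(np.diag(cov_full), 3))
print("variance / sigma^2, PC:", np.round(np.diag(cov_pc), 3))
```

With r = n the function returns the ordinary LS estimate, and dropping the
component with the smallest eigenvalue lowers every diagonal element of the
covariance matrix, as the comparison of (28) and (29) indicates.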


In practical application of the PC regression the question of how many
principal components, if any, should be eliminated may arise. The answer
could come from the diagnostic measures discussed and from commonly used
least squares criteria such as $s^2$, $R^2$ and others. Reference 5 presents
a way of finding an optimal value of r based on the minimization of a
criterion formed from the difference $\tilde{\gamma}_{PC} - \gamma$.


In the same reference it is also pointed out that the assumption of an
integral rank for X can be too restrictive. A possible improvement to the
principal components estimator, known as the fractional rank estimator, is
introduced. If the rank of X lies in the interval (r, r + 1), the fractional
rank estimator is given as

$$ \tilde{\gamma}_{FR} = [\hat{\gamma}_1, \hat{\gamma}_2, \ldots, \hat{\gamma}_r, c\,\hat{\gamma}_{r+1}, 0, \ldots, 0]^T $$

where c is chosen according to a criterion given in the same reference.

The second problem with the PC regression is related to the formulation of
regressors in equation (2). In numerical computation the measured data can
enter the analysis in various forms. Probably the simplest approach would be
to use regressors in their original form. If regressors scaled to unit
length are preferred, different principal components will be obtained. The
use of standardized regressors will result in $X^T X$ being a matrix of
correlations, and the principal components will be changed again. This
dependence of PC estimates on the form of regressors is obviously a weakness
of this estimation technique.
MIXED ESTIMATION


The mixed estimation was developed as a Bayes-like technique by augmenting
the measured data by prior information. For the linear model

$$ Y = X\theta + \varepsilon \tag{2} $$

with $E(\varepsilon) = 0$ and $E(\varepsilon\varepsilon^T) = \sigma^2 I$ it
is assumed that p < n prior restrictions on the elements of $\theta$ are
available. These restrictions are formulated as

$$ a = A\theta + \zeta \tag{30} $$

In (30) A is a $(p \times n)$ matrix of rank p which includes known
constants, a is a p-vector of values which can be specified, and $\zeta$ is
a random vector with

$$ E(\zeta) = 0, \quad E(\zeta\varepsilon^T) = 0 \quad \text{and} \quad E(\zeta\zeta^T) = \sigma^2 W $$

where W is a known weighting matrix.

Combining (2) and (30) the mixed model is given as

$$ \begin{bmatrix} Y \\ a \end{bmatrix} = \begin{bmatrix} X \\ A \end{bmatrix} \theta + \begin{bmatrix} \varepsilon \\ \zeta \end{bmatrix} \tag{31} $$

For known $\sigma^2$ the application of least squares to (31) results in
the mixed estimator

$$ \tilde{\theta}_{ME} = (X^T X + A^T W^{-1} A)^{-1} (X^T Y + A^T W^{-1} a) \tag{32} $$

Introducing the augmented variables $Y_a$, $X_a$ and $\varepsilon_a$ the
mixed model can also be written as

$$ Y_a = X_a \theta + \varepsilon_a \tag{33} $$

where

$$ E(\varepsilon_a) = 0 \quad \text{and} \quad E(\varepsilon_a \varepsilon_a^T) = \sigma^2 \begin{bmatrix} I & 0 \\ 0 & W \end{bmatrix} = \sigma^2 W_a $$
Then the mixed estimator can be expressed in the form

$$ \tilde{\theta}_{ME} = (X_a^T W_a^{-1} X_a)^{-1} X_a^T W_a^{-1} Y_a \tag{34} $$

It follows from the Gauss-Markoff theorem that $\tilde{\theta}_{ME}$ given
by (34) or (32) is an optimal unbiased linear estimator of $\theta$.

In real application of the mixed estimation the a priori information is
usually not known exactly. In this case

$$ a = A\theta + b + \zeta \tag{35} $$

where $b \ne 0$ is an unknown vector. The mixed estimator corresponding to
the condition given by (35) will be called $\tilde{\theta}_{Mb}$. The
expected value of $\tilde{\theta}_{Mb}$ is obtained by substituting (2) and
(35) into (32)

$$ \begin{aligned}
E(\tilde{\theta}_{Mb}) &= E\big[ M^{-1} X^T X \theta + M^{-1} X^T \varepsilon + M^{-1} A^T W^{-1} A \theta
   + M^{-1} A^T W^{-1} b + M^{-1} A^T W^{-1} \zeta \big] \\
 &= \theta + M^{-1} A^T W^{-1} b
\end{aligned} \tag{36} $$

where

$$ M = X^T X + A^T W^{-1} A \tag{37} $$

The estimate $\tilde{\theta}_{Mb}$ is therefore biased by the quantity
$M^{-1} A^T W^{-1} b$. The covariance matrix of the mixed estimator is

$$ \mathrm{Cov}(\tilde{\theta}_{ME}) = \sigma^2 M^{-1} \tag{38} $$

The difference between the covariance of the LS and ME estimates is given as

$$ \mathrm{Cov}(\hat{\theta}) - \mathrm{Cov}(\tilde{\theta}_{ME}) = \sigma^2 (X^T X)^{-1} - \sigma^2 (X^T X + A^T W^{-1} A)^{-1} \tag{39} $$

Since
$\mathrm{Cov}^{-1}(\tilde{\theta}_{ME}) - \mathrm{Cov}^{-1}(\hat{\theta}) = \sigma^{-2} (A^T W^{-1} A) \ge 0$,
the right side of (39) is a nonnegative definite matrix. This means that the
addition of a priori information to the ordinary regression will result in a
reduction of the variance of the LS estimates.
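
The equivalence of (32) and (34) and the variance reduction stated by (39)
can be checked numerically. The sketch below assumes NumPy and synthetic
data; the function names, the prior value and the weighting matrix W are
hypothetical choices for illustration, with a single prior restriction
placed on the third parameter.

```python
import numpy as np

def mixed_estimate(X, Y, A, a, W):
    """Mixed estimator of eq. (32); also returns M^-1 = Cov / sigma^2, eq. (38)."""
    W_inv = np.linalg.inv(W)
    M = X.T @ X + A.T @ W_inv @ A                       # eq. (37)
    theta = np.linalg.solve(M, X.T @ Y + A.T @ W_inv @ a)
    return theta, np.linalg.inv(M)

def mixed_estimate_augmented(X, Y, A, a, W):
    """Same estimator written as weighted LS on the augmented model, eq. (34)."""
    N, p = X.shape[0], A.shape[0]
    Xa = np.vstack([X, A])
    Ya = np.concatenate([Y, a])
    Wa = np.block([[np.eye(N), np.zeros((N, p))],
                   [np.zeros((p, N)), W]])
    Wa_inv = np.linalg.inv(Wa)
    return np.linalg.solve(Xa.T @ Wa_inv @ Xa, Xa.T @ Wa_inv @ Ya)

# Synthetic data: x3 is close to x1 + x2; the true third parameter is zero.
rng = np.random.default_rng(4)
N = 100
x1 = rng.normal(size=N)
x2 = rng.normal(size=N)
x3 = x1 + x2 + 0.05 * rng.normal(size=N)
X = np.column_stack([x1, x2, x3])
Y = X @ np.array([1.0, 2.0, 0.0]) + 0.1 * rng.normal(size=N)

A = np.array([[0.0, 0.0, 1.0]])   # restriction acts on the third parameter only
a = np.array([0.0])               # assumed a priori value
W = np.array([[0.01]])            # assumed prior variance relative to sigma^2

theta_me, cov_me = mixed_estimate(X, Y, A, a, W)
theta_aug = mixed_estimate_augmented(X, Y, A, a, W)
cov_ls = np.linalg.inv(X.T @ X)
diff_eigs = np.linalg.eigvalsh(cov_ls - cov_me)       # eq. (39) with sigma^2 = 1

print("mixed estimate:", np.round(theta_me, 3))
print("variance / sigma^2, LS:", np.round(np.diag(cov_ls), 4))
print("variance / sigma^2, ME:", np.round(np.diag(cov_me), 4))
```

The eigenvalues of the difference in (39) are nonnegative up to rounding, so
no parameter variance increases when the prior restriction is added.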
The restrictions on parameters given by (30) can take several forms. The
most common are:


a) A separate estimate of all parameters $\theta_j$, j = 1, 2, ..., n,
   exists. If these estimates are called $\theta_0$, then (30) is changed to

   $$ \theta_0 = \theta + \zeta $$

   which means that $a = \theta_0$ and A is the $n \times n$ identity
   matrix. If only the estimates of some elements of the vector $\theta$ are
   known a priori, say $\theta_0$, then

   $$ \theta_0 = [I, 0]\,\theta + \zeta $$

   and A = [I, 0], where the dimension of I in A corresponds to the
   dimension of $\theta_0$.

b) A special case occurs when W = 0. This corresponds to knowing that
   $a = A\theta$ with certainty. This situation leads to a piecewise
   regression discussed in ref. 6.

c) Sometimes the a priori information is given as a statement that
   particular parameters lie in a certain region $(d_{\min}, d_{\max})$. For
   the parameter $\theta_j$ it means that

   $$ \theta_{0,j} = \tfrac{1}{2}(d_{\min} + d_{\max}) + \zeta_j $$

EXAMPLES 


The detection of collinearity and the biased estimation techniques
described are demonstrated in two examples. The flight test data for these
examples were obtained from the longitudinal small-amplitude maneuvers of a
highly augmented, inherently unstable research aircraft. The longitudinal
motion of this aircraft was controlled by three surfaces, canard, flaperons
and strake, moved by an automatic control system. The data used in the
analysis were in the form of sampled time histories of the open-loop input
variables $\delta_c$, $\delta_f$ and $\delta_s$ and the output variables V,
$\alpha$, q, $a_Z$ and $\dot{q}$. The model for the vertical-force and
pitching-moment coefficients was formulated as

$$ C_a = C_{a_0} + C_{a_\alpha} \alpha + C_{a_q} \frac{q \bar{c}}{2V}
 + C_{a_{\delta c}} \delta_c + C_{a_{\delta s}} \delta_s + C_{a_{\delta f}} \delta_f \tag{40} $$


for a = Z or m. In (40) the regressors are represented by the increments of
the input and output variables from their values in steady flight conditions
prior to the excited motion. The dependent variables in (40) were computed
from the measured data. For the vertical-force coefficient the expression is

$$ C_Z = \frac{mg}{\bar{q} S}\, a_Z $$

[the corresponding expression for $C_m$ is illegible in the source]. The
unknown parameters in (40) are the stability and control derivatives, and
the bias term $C_{a_0}$.

Example 1. Three control variables:

The aircraft short-period response to a series of commanded pitch doublets
is illustrated in figure 2. Shown are time histories of three longitudinal
control and three output variables. Inspection of figure 2 reveals a very
close relationship among all three open-loop inputs, thus indicating a
strong possibility of data collinearity. For the assessment of collinearity
the correlation matrix of standardized regressors was formulated and its
determinant computed. The correlation matrix is shown in Table I. Examining
this matrix reveals simple correlations greater than 0.80 for two pairs of
regressors, $(\delta_c, \alpha)$ and $(\delta_c, \delta_s)$. The determinant
was found to be 0.00106. Therefore, the high pairwise correlations and the
low value of the determinant point to data collinearity. Because of the
weakness of the VIF as a diagnostic measure its values are not given.

In order to decide which regressors are affected by collinearity the
variance proportions were computed. They are presented in Table II for
scaled regressors in (40). Also included in the table are the eigenvalues of
the $X^T X$ matrix and the condition numbers. The variance proportions
corresponding to the largest condition number indicate a damaging dependency
involving four terms: $\delta_c$, $\delta_s$, q and the bias term. The
second dependency involves $\alpha$ and $\delta_f$. It corresponds to the
condition number $\kappa = 36$, which may be considered too small to have
any serious effect on the estimates.
As the result of the data collinearity assessment it was decided to use the
principal components regression with the smallest eigenvalue of $X^T X$ set
equal to zero, thus reducing the rank of the X matrix by one (r = 5). For
the mixed estimation the parameters $C_{Z_{\delta s}}$ and $C_{m_{\delta s}}$
were set at their wind-tunnel values with the uncertainties estimated from
repeated measurements in different facilities and for different
configurations. The selection of strake terms was based on the small
sensitivity of these parameters and the expected sufficient accuracy of
their a priori values. The a priori values and three different values of
their variance used in the mixed estimation are given in Table III.

In Table IV the results of the least squares, principal components and
mixed estimation of parameters in the equation for $C_Z$ are summarized.
Presented are the mean values and standard errors of parameter estimates,
and the standard errors of the $C_Z$ estimates using the residuals. Also
included are the sensitivities computed for wind-tunnel values given in the
last column of the table, and the increments of the squared multiple
correlation coefficient due to regressors in (40). Both the sensitivity
analysis and the squared multiple correlation coefficients indicate that the
only important term in the equation for $C_Z$ is $C_{Z_\alpha} \alpha$. This
term, in combination with $C_{Z_0}$, explains 99% of the variation in the
measured data. It can therefore be expected that data collinearity combined
with low sensitivities will cause severe identifiability problems for most
of the other parameters. These problems are immediately apparent from the LS
estimates of $C_{Z_{\delta s}}$ and $C_{Z_q}$, which are much higher than
the values from the wind tunnel and theory respectively.

The principal components regression was first applied to scaled data.
The results show no improvement over the LS results. When the original
regressors were used, however, the parameters $C_{Z_{\delta s}}$ and
$C_{Z_{\delta c}}$ came out with the correct sign. There was a small
increase in [symbol illegible in source, subscript $\alpha$] but a
substantial decrease in [symbol illegible in source]. The fit to the data,
measured by the standard error of $C_Z$, deteriorated. No explanation has
been found for the differences between the two sets of principal components
estimates. The mixed estimation with moderate and tight restrictions on the
a priori value gave the best sets of estimates when compared with the
wind-tunnel data and the results of the two previous techniques.

The results from the data governed by the pitching-moment equation are
presented in Table V. In the model for $C_m$ two terms,
$C_{m_{\delta c}} \delta_c$ and $C_{m_\alpha} \alpha$, are important.
Together they explain 97% of the variation in the data. The principal
components regression with the original regressors improves the LS estimates
of $C_{m_{\delta c}}$ and $C_{m_{\delta s}}$ and makes them consistent with
the mixed estimates under moderate or tight restrictions. A serious problem
with the principal components regression is the non-physical value for one
of the $C_m$ parameters [subscript illegible in source].
Example 2. Two control variables:


The second maneuver was commanded by two pitch doublets. This time the
control system held the flaperons at constant deflection. The time histories
of the input variables $\delta_c$ and $\delta_s$, and the output variables
$\alpha$, q and $a_Z$, are plotted in figure 3. The correlation matrix is
given in Table VI, showing strong correlation between $(\alpha, \delta_s)$
and $(\delta_s, \delta_c)$. The value of the determinant was equal to
0.0100. The variance proportions, eigenvalues of $X^T X$ and condition
numbers are presented in Table VII. The damaging dependencies for the
largest condition number are again among the regressors $\delta_s$,
$\delta_c$, q and the bias term.

Tables VIII and IX present the estimation results, sensitivities, squared
multiple correlation coefficients and the wind-tunnel data used in the mixed
estimation (values for $C_{Z_{\delta s}}$ and $C_{m_{\delta s}}$) and for
comparison. The principal components regression using the original data with
r = 4 did not bring the expected improvement over the LS estimates.
Therefore a further reduction in the rank of the X matrix was attempted. The
resulting estimates of the important parameters $C_{Z_\alpha}$,
$C_{m_\alpha}$ and $C_{m_{\delta c}}$ were much closer to those from the
mixed estimation. By observing the results for r = 3 and r = 4, and the
results from the mixed estimation, it is possible to argue that the optimum
r should be somewhere between 3 and 4. Such a selection of r would, however,
lead to the fractional rank estimator mentioned earlier but not developed in
this report.

As in the previous example, the mixed estimation with moderate or tight
restrictions (see Table III) resulted in the best sets of estimates. The
technique failed only to provide physically meaningful values for the
damping terms $C_{Z_q}$ and $C_{m_q}$. This failure could be explained by
the very low sensitivities of these parameters.


CONCLUDING REMARKS 

Near linear dependency in measured data, called collinearity, and its
effect on linear regression were briefly discussed. Then, procedures for
detection and assessment of collinearity were presented. They included the
evaluation of the correlation matrix and its inverse, eigensystem analysis or
singular value decomposition, and parameter variance decomposition. The first
of these procedures is relatively simple and straightforward, but it cannot
reveal the presence of several coexisting dependencies among the regressors.
Eigensystem analysis examines the eigenvalues of the matrix formed from the
regressors. Large condition numbers serve as indicators of data collinearity.
The singular value decomposition provides similar information. It is
preferred by some analysts because of greater stability in its computation.
Both approaches become more effective when combined with parameter variance
decomposition. This combination can find which regressors are near linearly
dependent and indicates what action should be taken in order to lessen the
effect of collinearity on the estimates. In connection with data collinearity
the problem of parameter identification was also addressed, and sensitivity
analysis was introduced as a tool for its assessment.
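
The diagnostics summarized above can be sketched in a few lines of NumPy.
This is a minimal illustration under stated assumptions, not the program used
in the report: the function name `collinearity_diagnostics` is hypothetical,
the columns are scaled to unit length, the condition number is taken as the
ratio of the largest eigenvalue to each eigenvalue, and the variance
proportions use the usual decomposition in which the variance of the jth
estimate is proportional to the sum over k of t_jk^2 / lambda_k, where t_jk
are elements of the eigenvectors of X^T X. The data are synthetic.

```python
import numpy as np

def collinearity_diagnostics(X):
    """Eigenvalues, condition numbers and variance proportions of X (N x n)."""
    # Scale every column (including a column of ones, if present) to unit length.
    Xs = X / np.linalg.norm(X, axis=0)
    # Eigensystem of the scaled X^T X; eigh returns eigenvalues in ascending order.
    lam, T = np.linalg.eigh(Xs.T @ Xs)
    order = np.argsort(lam)[::-1]            # reorder from largest to smallest
    lam, T = lam[order], T[:, order]
    cond = lam[0] / lam                      # assumed definition: lambda_max / lambda_k
    # phi[j, k] = t_jk^2 / lambda_k is the kth component of var(theta_j) / sigma^2.
    phi = T**2 / lam
    # Row k, column j: share of var(theta_j) associated with the kth eigenvalue.
    proportions = (phi / phi.sum(axis=1, keepdims=True)).T
    return lam, cond, proportions

# Synthetic example (assumption): x2 is almost a copy of x1, x3 is unrelated.
rng = np.random.default_rng(0)
N = 200
x1 = rng.normal(size=N)
x2 = x1 + 1e-3 * rng.normal(size=N)
x3 = rng.normal(size=N)
X = np.column_stack([np.ones(N), x1, x2, x3])

lam, cond, proportions = collinearity_diagnostics(X)
print(np.round(cond, 1))
print(np.round(proportions, 3))
```

In the printed table of proportions, the last row belongs to the smallest
eigenvalue. Regressors whose proportions in that row are close to one are the
ones involved in the near dependency.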

One way of dealing with collinearity is to use estimation techniques other
than ordinary least squares. This report explained the reasons for
using the biased estimation techniques and presented two of these techniques,
principal components regression and mixed estimation. The principal
components regression eliminates the effect of small eigenvalues by reducing
the rank of the matrix of regressors. The weakness of this technique can be
seen in the restriction to an integer rank and the dependence of the
estimates on the form of the regressors (original, scaled or standardized).
The mixed estimation is a Bayes-like technique which is applied to measured
data augmented by prior information. This estimation procedure can be very
successful provided that a priori values of selected parameters are known
with reasonable accuracy.
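
Both biased estimators can be sketched compactly. The code below is an
illustration under assumptions, not the report's implementation: the function
names are hypothetical, the principal components estimate simply keeps the r
largest eigenvalues of X^T X, and the mixed estimation uses the common
Theil-Goldberger form in which prior information d = A theta + noise, with
known prior variances, is appended to the data. The measurement noise
variance `sigma2` is assumed known, and the data are synthetic.

```python
import numpy as np

def pc_regression(X, y, r):
    """Principal components estimate keeping the r largest eigenvalues of X^T X."""
    lam, T = np.linalg.eigh(X.T @ X)             # ascending eigenvalues
    keep = np.argsort(lam)[::-1][:r]             # indices of the r largest
    T_r, lam_r = T[:, keep], lam[keep]
    return T_r @ ((T_r.T @ (X.T @ y)) / lam_r)

def mixed_estimation(X, y, A, d, prior_var, sigma2):
    """Estimate from data plus prior information d = A theta + noise."""
    A = np.atleast_2d(np.asarray(A, dtype=float))
    d = np.atleast_1d(np.asarray(d, dtype=float))
    W_inv = np.diag(1.0 / np.atleast_1d(np.asarray(prior_var, dtype=float)))
    lhs = X.T @ X / sigma2 + A.T @ W_inv @ A
    rhs = X.T @ y / sigma2 + A.T @ W_inv @ d
    return np.linalg.solve(lhs, rhs)

# Synthetic, nearly collinear data (assumption for illustration only).
rng = np.random.default_rng(1)
N = 100
x1 = rng.normal(size=N)
x2 = x1 + 0.01 * rng.normal(size=N)
X = np.column_stack([x1, x2])
theta_true = np.array([1.0, 2.0])
y = X @ theta_true + 0.1 * rng.normal(size=N)

theta_ls = np.linalg.lstsq(X, y, rcond=None)[0]
theta_pc = pc_regression(X, y, r=1)
# Prior information: second parameter is about 2.0 with variance 0.01.
theta_me = mixed_estimation(X, y, A=[[0.0, 1.0]], d=[2.0],
                            prior_var=[0.01], sigma2=0.01)
print(theta_ls, theta_pc, theta_me)
```

With r equal to the number of regressors the principal components estimate
reduces to ordinary least squares, and with a very large prior variance the
mixed estimate does the same, which makes both functions easy to check.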

The detection and assessment of collinearity, and the two biased
estimation techniques were demonstrated in two examples using flight data
from longitudinal maneuvers of an experimental aircraft. In these examples
the correlation matrix of regressors indicated the existence of correlation
between two pairs of regressors. The variance proportions, however,
determined which regressors were affected by collinearity. The estimates of
parameters in the aerodynamic model equations for the vertical-force and
pitching-moment coefficient were also obtained by the ordinary least
squares. These results confirmed a damaging effect of collinearity on the
estimated values and their standard errors. The principal components
regression provided substantially improved estimates with the exception of
the damping-in-pitch derivative. Some further improvement was obtained from
mixed estimation where the a priori values were taken from wind-tunnel data.
The parameter estimates were completed by the results of the sensitivity
analysis and by increments in the squared multiple correlation coefficient
indicating the importance of individual terms in regression equations. The
proposed procedure for dealing with data collinearity proved that it could
become a useful approach for estimating parameters of a highly augmented,
possibly unstable aircraft from flight data.

APPENDIX A


SCALED AND STANDARDIZED REGRESSORS 


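In order to have the columns of the X matrix of unit length, the original
regressors $x_j$ are replaced by scaled regressors $x'_j$ using the formula

$$x'_{ji} = \frac{x_{ji}}{\sqrt{\sum_{k=1}^{N} x_{jk}^2}} \qquad \text{(A.1)}$$

for j = 1, 2, ..., n, and i = 1, 2, ..., N.

Using the scaled regressors the model in (2) is changed to

$$Y = X' \Theta' \qquad \text{(A.2)}$$

where

$$X' = \begin{bmatrix}
\frac{1}{\sqrt{N}} & x'_{11} & x'_{21} & \cdots & x'_{n1} \\
\frac{1}{\sqrt{N}} & x'_{12} & x'_{22} & \cdots & x'_{n2} \\
\vdots & \vdots & \vdots & & \vdots \\
\frac{1}{\sqrt{N}} & x'_{1N} & x'_{2N} & \cdots & x'_{nN}
\end{bmatrix} \qquad \text{(A.3)}$$

and the new parameters $\theta'_j$ are related to the original parameters
$\theta_j$ by the equation

$$\theta_j = \frac{\theta'_j}{\sqrt{\sum_{i=1}^{N} x_{ji}^2}} \qquad \text{(A.4)}$$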
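
The scaling in (A.1) and the back-transformation in (A.4) can be checked
numerically. The sketch below is illustrative only: the function name
`scale_regressors` and the synthetic data are assumptions, and the column of
ones for the bias term is scaled together with the other columns, so that it
becomes $1/\sqrt{N}$ as in (A.3).

```python
import numpy as np

def scale_regressors(X):
    """Return X with unit-length columns, as in (A.1), and the scale factors."""
    norms = np.sqrt((X**2).sum(axis=0))      # sqrt(sum_i x_ji^2) for each column j
    return X / norms, norms

# Synthetic data (assumption): a bias column and two regressors of different size.
rng = np.random.default_rng(2)
N = 50
X = np.column_stack([np.ones(N),
                     rng.normal(5.0, 2.0, N),
                     rng.normal(-1.0, 0.1, N)])
y = X @ np.array([0.5, -1.0, 3.0]) + 0.05 * rng.normal(size=N)

X_scaled, norms = scale_regressors(X)
theta_scaled = np.linalg.lstsq(X_scaled, y, rcond=None)[0]   # parameters of (A.2)
theta = theta_scaled / norms                                 # back to original, (A.4)
theta_direct = np.linalg.lstsq(X, y, rcond=None)[0]          # LS on unscaled data
print(theta, theta_direct)
```

For ordinary least squares the two parameter vectors agree, since scaling
only changes the units of the regressors. The distinction matters for
principal components regression, where the retained components depend on the
form of the regressors.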
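
The standardized (scaled and centered) regressors are obtained as

$$x^*_{ji} = \frac{x_{ji} - \bar{x}_j}{\sqrt{S_{jj}}} \qquad \text{(A.5)}$$

where

$$\bar{x}_j = \frac{1}{N} \sum_{i=1}^{N} x_{ji}, \qquad
S_{jj} = \sum_{i=1}^{N} (x_{ji} - \bar{x}_j)^2$$

The regression model has the form

$$Y = X^* \Theta^* \qquad \text{(A.6)}$$

with the LS estimates

$$\hat{\Theta}^* = (X^{*T} X^*)^{-1} X^{*T} Y \qquad \text{(A.7)}$$

where $(X^{*T} X^*)$ is the correlation matrix of regressors. The original
parameters are related to the parameters in (A.6) as

$$\theta_j = \frac{\theta^*_j}{\sqrt{S_{jj}}} \qquad \text{(A.8)}$$

and

$$\theta_0 = \theta^*_0 - \frac{\theta^*_1}{\sqrt{S_{11}}} \bar{x}_1
- \frac{\theta^*_2}{\sqrt{S_{22}}} \bar{x}_2 - \cdots
- \frac{\theta^*_n}{\sqrt{S_{nn}}} \bar{x}_n \qquad \text{(A.9)}$$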
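
A short numerical check of the standardization and of the relations between
the standardized and original parameters is given below. It is a sketch under
assumptions: the function names are hypothetical, the regressor matrix is
passed without the column of ones, and the bias term of the standardized
model is estimated as the mean of y, which is its LS estimate when all
regressors are centered.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit length (X has no column of ones)."""
    xbar = X.mean(axis=0)
    S = ((X - xbar)**2).sum(axis=0)          # sum of squared deviations per column
    return (X - xbar) / np.sqrt(S), xbar, S

def to_original(theta0_star, theta_star, xbar, S):
    """Map standardized parameters back to the original ones."""
    theta = theta_star / np.sqrt(S)          # slope parameters
    theta0 = theta0_star - theta @ xbar      # bias term
    return theta0, theta

# Synthetic data (assumption for illustration only).
rng = np.random.default_rng(3)
N = 60
X = np.column_stack([rng.normal(2.0, 1.0, N), rng.normal(-3.0, 0.5, N)])
y = 1.5 + X @ np.array([0.8, -2.0]) + 0.05 * rng.normal(size=N)

X_star, xbar, S = standardize(X)
R = X_star.T @ X_star                        # correlation matrix of regressors
theta_star = np.linalg.solve(R, X_star.T @ y)
theta0_star = y.mean()                       # bias estimate with centered regressors
theta0, theta = to_original(theta0_star, theta_star, xbar, S)
print(theta0, theta)
```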

APPENDIX B


SQUARED MULTIPLE CORRELATION COEFFICIENT 

The regression equation with LS estimates is given as

$$Y = X \hat{\Theta} + e \qquad \text{(B.1)}$$

where it is assumed that the regressors and dependent variable are centered.
Premultiplying each side of (B.1) by its own transpose results in

$$Y^T Y = \hat{\Theta}^T X^T X \hat{\Theta} + e^T e \qquad \text{(B.2)}$$

The cross term $\hat{\Theta}^T X^T e = 0$ because the vector of residuals e
is orthogonal to each of the n columns of X. From (B.2) it can be concluded
that a fraction $R^2$ of $\sum_{i=1}^{N} y_i^2$ is accounted for by the
regressors and that a fraction $1 - R^2$ is represented by residuals. Then

$$R^2 = \frac{\hat{\Theta}^T X^T X \hat{\Theta}}{Y^T Y}$$

and

$$1 - R^2 = \frac{e^T e}{Y^T Y} \qquad \text{(B.3)}$$

The $R^2$ is known as the squared multiple correlation coefficient associated
with (B.1). This coefficient can be interpreted as a measure of variability
in y explained by the regression model.

For the regression equation with the bias term $\theta_0$ the measure
$\sum_{i=1}^{N} y_i^2$ is replaced by the sum of squared values taken as
deviations from the mean, i.e. $\sum_{i=1}^{N} (y_i - \bar{y})^2$. With the
new measure the expressions for $R^2$ will take the form

$$R^2 = \frac{\hat{\Theta}^T X^T X \hat{\Theta} - N \bar{y}^2}{Y^T Y - N \bar{y}^2}$$

$$1 - R^2 = \frac{e^T e}{Y^T Y - N \bar{y}^2} \qquad \text{(B.4)}$$

where $\bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i$.
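
The two expressions for the squared multiple correlation coefficient can be
evaluated side by side. The sketch below is illustrative: the function name
and the synthetic data are assumptions, and the parameters are obtained by
ordinary least squares. With `bias_term=True` it evaluates the pair of
expressions that subtract $N \bar{y}^2$; with `bias_term=False` it evaluates
the uncorrected pair.

```python
import numpy as np

def squared_multiple_correlation(X, y, bias_term=True):
    """Return R^2 computed from the explained part and from the residuals."""
    theta = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ theta                                    # LS residuals
    correction = len(y) * y.mean()**2 if bias_term else 0.0
    total = y @ y - correction                           # denominator of R^2
    explained = theta @ (X.T @ X) @ theta - correction   # numerator of R^2
    return explained / total, 1.0 - (e @ e) / total

# Synthetic data (assumption): bias column plus two regressors.
rng = np.random.default_rng(4)
N = 80
X = np.column_stack([np.ones(N), rng.normal(size=N), rng.normal(size=N)])
y = X @ np.array([2.0, 1.0, -0.5]) + 0.3 * rng.normal(size=N)

r2_explained, r2_residual = squared_multiple_correlation(X, y)
print(r2_explained, r2_residual)
```

The two returned values coincide for LS estimates because the residuals are
orthogonal to the columns of X. Computing the increment in this coefficient
as terms are added to the model gives the measure of importance of individual
terms used in the report.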

[Table fragment, data of Example 1: eigenvalues 3.2401, 1.4295, ...]

[Table fragment: prior value -.33 with 95% intervals and implied variances.
Recoverable rows: interval (-.41, -.25), implied variance .0016; interval
(-.37, -.29), implied variance .0004. Other implied variances listed: .0016,
.0036, .0004.]



TABLE IV. - LEAST SQUARES AND BIASED ESTIMATES, AND WIND-TUNNEL DATA FOR VERTICAL-FORCE 
Theoretical value; Note: the values in parentheses are standard errors




TABLE V. - LEAST SQUARES AND BIASED ESTIMATES, AND WIND-TUNNEL DATA FOR PITCHING- 
MOMENT PARAMETERS. EXAMPLE 1. 





Theoretical value; Note: the values in parentheses are standard errors


TABLE VI. - $X^{*T} X^*$ MATRIX IN CORRELATION FORM. EXAMPLE 2.

              $\alpha$    $qc/2V$    $\delta_s$    $\delta_c$
$\alpha$       1.000       .289        .813         -.780
$qc/2V$                   1.000        .227         -.159
$\delta_s$                            1.000         -.987
$\delta_c$                                          1.000


TABLE VII. - COLLINEARITY DIAGNOSTIC FOR DATA IN EXAMPLE 2.

                             Variance proportions (scaled regressors)
Eigenvalue   Condition     1      $\alpha$   $qc/2V$   $\delta_s$   $\delta_c$
             number
  2.7482         1       .000      .494       .000       .003        .000
  1.0916         3       .011      .095       .000       .049        .000
   .9184         3       .002      .003       .204       .001        .000
   .2268        12       .063      .204       .012       .031        .073
   .0150       184       .924      .203       .784       .917        .926
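
As a quick consistency check on Table VII, the condition numbers can be
recomputed from the tabulated eigenvalues. The sketch assumes that the
tabulated condition number is the ratio of the largest eigenvalue to each
eigenvalue, and that the five columns (bias term and four regressors) were
scaled to unit length, so that the eigenvalues should sum to about 5.

```python
import numpy as np

# Eigenvalues as printed in Table VII (rounded to four decimals).
eigenvalues = np.array([2.7482, 1.0916, 0.9184, 0.2268, 0.0150])

# Assumed definition: condition number = largest eigenvalue / eigenvalue.
condition_numbers = eigenvalues[0] / eigenvalues

print(np.round(condition_numbers, 1))
print(eigenvalues.sum())
```

The last ratio is sensitive to the rounding of the smallest eigenvalue, so
agreement with the tabulated value to within about one unit is all that can
be expected from the printed digits.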

TABLE IX. - LEAST SQUARES AND BIASED ESTIMATES, AND WIND-TUNNEL DATA FOR PITCHING-
Theoretical value; Note: the values in parentheses are standard errors
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Figure 1. - Distributions of unbiased and biased estimators of 
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